Sparse Matrix-Vector Multiplication using Pthreads
نویسندگان
چکیده
Optimizations: 1) Loop Optimization [1]: A typical matrix-vector multiplication (matrix in CSR format) consists of a nested loop where the outer loop iterates over all the rows and the inner loop iterates over columns in those rows. Since the data is stored in a sequential fashion in CSR (one row after the other), the data can be accessed by the nested loop using a single loop variable instead of two (in the typical algorithm). As expected, loop unrolling gave a good performance boost to our initial simple CSR-based algorithm.
منابع مشابه
Sparse Matrix Multiplication on CAM Based Accelerator
Sparse matrix multiplication is an important component of linear algebra computations. In this paper, an architecture based on Content Addressable Memory (CAM) and Resistive Content Addressable Memory (ReCAM) is proposed for accelerating sparse matrix by sparse vector and matrix multiplication in CSR format. Using functional simulation, we show that the proposed ReCAM-based accelerator exhibits...
متن کاملSparse Matrix-vector Multiplication on Nvidia Gpu
In this paper, we present our work on developing a new matrix format and a new sparse matrix-vector multiplication algorithm. The matrix format is HEC, which is a hybrid format. This matrix format is efficient for sparse matrix-vector multiplication and is friendly to preconditioner. Numerical experiments show that our sparse matrix-vector multiplication algorithm is efficient on
متن کاملTransposition Mechanism for Sparse Matrices on Vector Processors
Many scientific applications involve operations on sparse matrices. However, due to irregularities induced by the sparsity patterns, many operations on sparse matrices execute inefficiently on traditional scalar and vector architectures. To tackle this problem a scheme has been proposed consisting of two parts: (a) An extension to a vector architecture to support sparse matrix-vector multiplica...
متن کاملOptimizing Sparse Matrix Vector Multiplication on SMPs
We describe optimizations of sparse matrix-vector multiplication on uniprocessors and SMPs. The optimization techniques include register blocking, cache blocking, and matrix reordering. We focus on optimizations that improve performance on SMPs, in particular, matrix reordering implemented using two diierent graph algorithms. We present a performance study of this algorithmic kernel, showing ho...
متن کاملTowards a fast parallel sparse matrix-vector multiplication
The sparse matrix-vector product is an important computational kernel that runs ineffectively on many computers with super-scalar RISC processors. In this paper we analyse the performance of the sparse matrix-vector product with symmetric matrices originating from the FEM and describe techniques that lead to a fast implementation. It is shown how these optimisations can be incorporated into an ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011